Competitively Evolving Decision Trees Against Fixed Training Cases for Natural Language Processing
Abstract
Competitive fitness functions can yield better performance than absolute fitness functions [Angeline and Pollack 1993], [Hillis 1992]. This chapter describes a way to implement competition when training over a fixed (static) set of examples. Since new training cases cannot be generated by mutation or crossover, the probabilities with which individual training cases are selected adapt competitively instead. We evolve decision trees for the problem of word sense disambiguation. The decision trees contain embedded bit strings; bit-string crossover is intermingled with subtree swapping. To address the problem of overlearning, we have implemented a fitness penalty function specialized for decision trees, which depends on the partition of the training set implied by a decision tree.
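The idea of competitively adapting selection frequencies over a fixed training set can be illustrated with a minimal Python sketch. The abstract does not give the chapter's actual update rule, so everything here is an assumption made for illustration: the function names `update_case_weights` and `sample_cases`, the `rate` parameter, and the rule itself, which moves each case's selection weight toward the fraction of the population that misclassified it.

```python
import random


def update_case_weights(weights, failures, population_size, rate=0.5):
    """Shift selection weight toward cases the population tends to get wrong.

    weights:         current selection weight of each training case
    failures:        number of individuals that misclassified each case
    population_size: number of individuals evaluated
    rate:            hypothetical blending factor (not from the chapter)
    """
    blended = []
    for weight, failed in zip(weights, failures):
        # Fraction of the population "beaten" by this training case.
        difficulty = failed / population_size
        blended.append((1 - rate) * weight + rate * difficulty)

    total = sum(blended)
    if total == 0:
        # Degenerate case: fall back to uniform selection.
        return [1.0 / len(weights)] * len(weights)
    # Normalize so the weights form a probability distribution.
    return [value / total for value in blended]


def sample_cases(cases, weights, k, rng=random):
    """Draw k training cases (with replacement) according to their weights."""
    return rng.choices(cases, weights=weights, k=k)


# Four fixed training cases, initially selected uniformly.
cases = ["case_a", "case_b", "case_c", "case_d"]
weights = [0.25, 0.25, 0.25, 0.25]

# Hypothetical failure counts from a population of 10 decision trees.
weights = update_case_weights(weights, failures=[0, 5, 10, 5], population_size=10)
batch = sample_cases(cases, weights, k=3, rng=random.Random(0))
```

Under this assumed rule, the training cases never change, but the harder ones are drawn more often in later generations, which is one way a static training set can still exert competitive pressure on the evolving trees.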
Similar resources
Tail-Recursive Distributed Representations and Simple Recurrent Networks
Representation poses important challenges to connectionism. The ability to structurally compose representations is critical in achieving the capability considered necessary for cognition. We are investigating distributed patterns that represent structure as part of a larger effort to develop a natural language processor. Recursive Auto-Associative Memory (RAAM) representations show unusual prom...
Fast and Accurate Decision Trees for Natural Language Processing Tasks
Decision trees have long been used in many machine-learning tasks; they have a clear structure that provides insight into the training data and are simple to conceptually understand and implement. We present an optimized tree-computation algorithm based on the original ID3 algorithm. We introduce a tree-pruning method that uses the development set to delete nodes from overfitted models, as well...
Abstraction is Harmful in Language Learning
The usual approach to learning language processing tasks such as tagging, parsing, grapheme-to-phoneme conversion, PP-attachment, etc., is to extract regularities from training data in the form of decision trees, rules, probabilities or other abstractions. These representations of regularities are then used to solve new cases of the task. The individual training examples on which the abstracti...
Part-of-Speech Tagging Using Decision Trees
We have applied inductive learning of statistical decision trees to the Natural Language Processing (NLP) task of morphosyntactic disambiguation (Part-of-Speech Tagging). Previous work showed that the acquired language models are independent enough to be easily incorporated, as a statistical core of rules, in any flexible tagger. They are also complete enough to be directly used as sets of POS d...
Classification algorithms applied to narrative reports
Narrative text reports represent a significant source of clinical data. However, the information stored in these reports is inaccessible to many automated decision support systems. Data mining techniques can assist in extracting information from narrative data. Multiple classification methods, such as rule generation, decision trees, Bayesian classifiers, and information retrieval were used to ...